1,006 research outputs found

    Deductive and Analogical Reasoning on a Semantically Embedded Knowledge Graph

    Representing knowledge as high-dimensional vectors in a continuous semantic vector space can help overcome the brittleness and incompleteness of traditional knowledge bases. We present a method for performing deductive reasoning directly in such a vector space, combining analogy, association, and deduction in a straightforward way at each step in a chain of reasoning, drawing on knowledge from diverse sources and ontologies. Comment: AGI 201

    The use of thematic mapper data for land cover discrimination: Preliminary results from the UK SATMaP programme

    The principal objectives of the UK SATMaP program are to determine thematic mapper (TM) performance with particular reference to spatial resolution properties and geometric characteristics of the data. So far, analysis is restricted to images from the U.S. and concentrates on spectral and radiometric properties. The results indicate that the data are inherently three-dimensional compared with the two-dimensional character of MSS data. Preliminary classification results indicate the importance of the near infrared band (TM 4), at least one middle infrared band (TM 5 or TM 6) and at least one of the visible bands (preferably either TM 3 or TM 1). The thermal infrared also appears to have discriminatory ability despite its coarser spatial resolution. For band 4 the forward and reverse scans show somewhat different spectral responses in one scene, but this effect is absent in the other scene analyzed. Examination of the histograms suggests that the full 8-bit quantization is not being effectively utilized for all the bands.

    Neural Distributed Autoassociative Memories: A Survey

    Introduction. Neural network models of autoassociative, distributed memory allow storage and retrieval of many items (vectors) where the number of stored items can exceed the vector dimension (the number of neurons in the network). This opens the possibility of a sublinear time search (in the number of stored items) for approximate nearest neighbors among vectors of high dimension. The purpose of this paper is to review models of autoassociative, distributed memory that can be naturally implemented by neural networks (mainly with local learning rules and iterative dynamics based on information locally available to neurons). Scope. The survey focuses mainly on the Hopfield, Willshaw, and Potts networks, which have connections between pairs of neurons and operate on sparse binary vectors. We discuss not only autoassociative memory, but also the generalization properties of these networks. We also consider neural networks with higher-order connections and networks with a bipartite graph structure for non-binary data with linear constraints. Conclusions. We discuss the relations to similarity search, advantages and drawbacks of these techniques, and topics for further research. An interesting and still not completely resolved question is whether neural autoassociative memories can search for approximate nearest neighbors faster than other index structures for similarity search, in particular for the case of very high dimensional vectors. Comment: 31 page

    A Comprehensive Bioinformatics Analysis of the Nudix Superfamily in Arabidopsis thaliana

    Nudix enzymes are a superfamily with a conserved common reaction mechanism that provides the capacity for the hydrolysis of a broad spectrum of metabolites. We used hidden Markov models (HMMs) based on Nudix sequences from the PFAM and PROSITE databases to identify Nudix hydrolases encoded by the Arabidopsis genome. Twenty-five Nudix hydrolases were identified and classified into 11 individual families by pairwise sequence alignments. Intron phases were strikingly conserved in each family. Phylogenetic analysis showed that all multimember families formed monophyletic clusters. Conserved familial sequence motifs were identified with the MEME motif analysis algorithm. One motif (motif 4) was found in three diverse families. All proteins containing motif 4 demonstrated a degree of preference for substrates containing an ADP moiety. We conclude that HMM-based genome scanning and MEME motif analysis can significantly improve, respectively, the identification and the functional assignment of new members of this mechanistically diverse protein superfamily.

    Dynamic sorted neighborhood indexing for real-time entity resolution

    Real-time Entity Resolution (ER) is the process of matching query records in subsecond time with records in a database that represent the same real-world entity. Indexing techniques are generally used to efficiently extract a set of candidate records from the database that are similar to a query record, and that are to be compared with the query record in more detail. The sorted neighborhood indexing method, which sorts a database and compares records within a sliding window, has been successfully used for ER of large static databases. However, because it is based on static sorted arrays and is designed for batch ER that resolves all records in a database rather than resolving those relating to a single query record, this technique is not suitable for real-time ER on dynamic databases that are constantly updated. We propose a tree-based technique that facilitates dynamic indexing based on the sorted neighborhood method, which can be used for real-time ER, and investigate both static and adaptive window approaches. We propose an approach to reduce query matching times by precalculating the similarities between attribute values stored in neighboring tree nodes. We also propose a multitree solution that builds several distinct index trees with different sorting keys to reduce the effects of errors and variations in attribute values on matching quality. We experimentally evaluate our proposed techniques on large real datasets, as well as on synthetic data with different data quality characteristics. Our results show that as the index grows, no appreciable increase occurs in either record insertion or query times, and that using multiple trees gives noticeable improvements in matching quality with only a small increase in query time. Compared to earlier indexing techniques for real-time ER, our approach achieves significantly reduced indexing and query matching times while maintaining high matching accuracy.

    Identification of functional domains in Arabidopsis thaliana mRNA decapping enzyme (AtDcp2)

    The Arabidopsis thaliana decapping enzyme (AtDcp2) was characterized by bioinformatics analysis and by biochemical studies of the enzyme and mutants produced by recombinant expression. Three functionally significant regions were detected: (i) a highly disordered C-terminal region with a putative PSD-95, Discs-large, ZO-1 (PDZ) domain-binding motif, (ii) a conserved Nudix box constituting the putative active site and (iii) a putative RNA binding domain consisting of the conserved Box B and a preceding loop region. Mutation of the putative PDZ domain-binding motif improved the stability of recombinant AtDcp2 and secondary mutants expressed in Escherichia coli. Such recombinant AtDcp2 specifically hydrolysed capped mRNA to produce 7-methyl GDP and decapped RNA. AtDcp2 activity was Mn2+- or Mg2+-dependent and was inhibited by the product 7-methyl GDP. Mutation of the conserved glutamate-154 and glutamate-158 in the Nudix box reduced AtDcp2 activity up to 400-fold and showed that AtDcp2 employs the catalytic mechanism conserved amongst Nudix hydrolases. Unlike many Nudix hydrolases, AtDcp2 is refractory to inhibition by fluoride ions. Decapping was dependent on binding to the mRNA moiety rather than to the 7-methyl diguanosine triphosphate cap of the substrate. Mutational analysis of the putative RNA-binding domain confirmed the functional significance of an 11-residue loop region and the conserved Box B.

    Assessment of JSBACHv4.30 as a land component of ICON-ESM-V1 in comparison to its predecessor JSBACHv3.2 of MPI-ESM1.2

    We assess the land surface model JSBACHv4 (Jena Scheme for Biosphere Atmosphere Coupling in Hamburg version 4), which was recently developed at the Max Planck Institute for Meteorology as part of the effort to build the new Icosahedral Nonhydrostatic (ICON) Earth system model (ESM), ICON-ESM. We assess JSBACHv4 in simulations coupled with ICON-A, the atmosphere model of ICON-ESM, hosting JSBACHv4 as land component to provide the surface boundary conditions. The assessment is based on a comparison of simulated albedo, land surface temperature (LST), leaf area index (LAI), terrestrial water storage (TWS), fraction of absorbed photosynthetically active radiation (FAPAR), net primary production (NPP), and water use efficiency (WUE) with corresponding observational data. JSBACHv4 is the successor of JSBACHv3; therefore, another purpose of this study is to document how this step in model development has changed model biases. This is achieved by also assessing, in parallel, the results of coupled land-atmosphere simulations with the preceding model ECHAM6 hosting JSBACHv3. Large albedo biases appear in both models over ice sheets and in central Asia. The temperate to boreal warm bias observed in simulations with JSBACHv3 largely remained in JSBACHv4, despite the very good agreement with observed LST in the global mean. For the assessment of changes in land water storage, a novel procedure is suggested to compare the gravitational data from the Gravity Recovery And Climate Experiment (GRACE) satellites to simulated TWS. It turns out that the agreement of the changes in the seasonal cycle of TWS is sensitive to the representation of precipitation in the atmosphere model. The LAI is generally too high, which is partly caused by excessive soil moisture and also by the parameterization of the phenology itself. The pattern of WUE is, for both models, largely as observed. In India, WUE is too high, probably because JSBACH does not incorporate irrigation in our simulations. WUE differences between the two models can be traced back to differences in precipitation patterns in the two coupled land-atmosphere simulations. For both models, most NPP biases can be associated with biases in water stress, LAI, and FAPAR. In particular, the NPP bias of the Eurasian steppes has switched from positive in JSBACHv3 to negative in JSBACHv4. This difference is mainly caused by weaker precipitation and lower FAPAR of ICON-A-JSBACHv4 in July, which is most probably caused by a feedback loop between too little soil moisture, evaporation, and clouds. While the size and patterns of biases in albedo and LST are largely similar between the two model versions, they are less well correlated for precipitation- and vegetation-related variables like FAPAR. Overall, the biases found in the different assessment variables are either already known from the previous implementation in the Max Planck Institute Earth System Model (MPI-ESM) or have changed because of the coupling with the new atmospheric component ICON-A. Accordingly, this study demonstrates the technically successful completion of the re-implementation of JSBACH into ICON-ESM-V1. As discussed, there are good prospects for mitigating the biases through an improved representation of the processes.

    Lateral Movement of Water and Sugar Across Xylem in Sugarcane Stalks
